1.
Multimed Tools Appl ; : 1-14, 2023 May 29.
Article in English | MEDLINE | ID: covidwho-20231974

ABSTRACT

Because COVID-19 spreads via physical contact and regulations require face masks, the pandemic has created tough challenges for speaker recognition. Masks help prevent COVID-19 transmission, but their implications for system performance in a clean environment and at varying levels of background noise are unclear. A face mask affects speech output: the mask's frequency response and radiation qualities, which vary depending on its material and design, make speech harder to comprehend. In this study, we recorded speech while wearing a face mask to see how different masks affected a state-of-the-art text-independent speaker verification system built on an i-vector speaker identification framework. This research investigates the influence of facial coverings on speaker verification; to this end, we examined the effect of fabric masks on speaker identification in a cafeteria setting. These results present preliminary speaker recognition rates as well as mask verification trials. The results show that masks had little to no effect in low background noise, with an EER of 2.4-2.5% at 20 dB SNR for both masks, compared to no mask at the same level. In noisy conditions, accuracy was 12.7-13.0% lower than without a mask at 5 dB SNR, indicating that while different masks perform similarly at low background noise levels, their effect becomes more noticeable at high noise levels.
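The abstract above reports verification performance as an equal error rate (EER): the operating point at which the false-accept rate (impostors accepted) equals the false-reject rate (genuine speakers rejected). As a minimal sketch of how such a figure is computed from verification scores (the function name and the toy scores are illustrative, not from the paper):

```python
import numpy as np

def compute_eer(genuine_scores, impostor_scores):
    """Equal Error Rate: sweep thresholds over all observed scores and
    return the error rate where false-accept ~ false-reject."""
    genuine = np.asarray(genuine_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    best_gap, eer = np.inf, 1.0
    for t in thresholds:
        far = np.mean(impostor >= t)   # impostor trials accepted
        frr = np.mean(genuine < t)     # genuine trials rejected
        if abs(far - frr) < best_gap:  # keep the most balanced point
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer

# Toy example: one genuine score falls below one impostor score,
# so the best threshold misclassifies one trial on each side.
print(compute_eer([0.9, 0.8, 0.3], [0.1, 0.2, 0.7]))  # -> 0.333...
```

In practice, published EERs are computed over thousands of trials with interpolation between thresholds; this brute-force sweep conveys the definition rather than a production implementation.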

2.
Comput Biol Med ; 149: 105926, 2022 10.
Article in English | MEDLINE | ID: covidwho-2035907

ABSTRACT

This study proposes depression detection systems based on the i-vector framework for classifying speakers as depressed or healthy and predicting depression levels according to the Beck Depression Inventory-II (BDI-II). Linear and non-linear speech features are investigated as front-end features to i-vectors. To take advantage of the complementary effects of the features, i-vector systems based on linear and non-linear features are combined through decision-level fusion. Variability compensation techniques, such as Linear Discriminant Analysis (LDA) and Within-Class Covariance Normalization (WCCN), are widely used to reduce unwanted variabilities. A technique more generalizable than LDA is required when limited training data are available. To address this problem, we employ a support vector discriminant analysis (SVDA) technique that uses class boundaries to find discriminatory directions. Experiments conducted on the 2014 Audio-Visual Emotion Challenge and Workshop (AVEC 2014) depression database indicate that the best accuracy improvement obtained using SVDA is about 15.15% over uncompensated i-vectors. In all cases, experimental results confirm that decision-level fusion of i-vector systems based on three feature sets, TEO-CB-Auto-Env+Δ, Glottal+Δ, and MFCC+Δ+ΔΔ, achieves the best results. This fusion significantly improves classification results, yielding an accuracy of 90%. The combination of SVDA-transformed BDI-II score prediction systems based on these three feature sets achieved an RMSE of 8.899 and an MAE of 6.991, improvements of 29.18% and 30.34%, respectively, over the baseline system on the test partition. Furthermore, this proposed combination outperforms other audio-based studies available in the literature using the AVEC 2014 database.
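The abstract above combines per-feature-set classifiers through decision-level fusion: each i-vector system emits a binary decision, and the decisions are merged rather than the raw scores. A minimal sketch of one common form, a weighted vote over per-system decisions (the function, weights, and threshold are illustrative assumptions, not the paper's exact fusion rule):

```python
import numpy as np

def fuse_decisions(system_scores, weights=None, threshold=0.5):
    """Decision-level fusion: binarize each system's scores at `threshold`,
    then take a weighted vote across systems for each trial."""
    scores = np.asarray(system_scores, dtype=float)   # (n_systems, n_trials)
    decisions = (scores >= threshold).astype(float)   # per-system 0/1 decisions
    if weights is None:
        weights = np.ones(scores.shape[0]) / scores.shape[0]  # uniform vote
    fused = np.average(decisions, axis=0, weights=weights)
    return (fused >= 0.5).astype(int)                 # majority wins

# Three hypothetical systems scoring two trials: the first trial is
# accepted by two systems out of three, the second by only one.
print(fuse_decisions([[0.9, 0.2],
                      [0.8, 0.6],
                      [0.1, 0.4]]))  # -> [1 0]
```

Fusing at the decision level (rather than averaging raw scores) keeps the systems' score scales independent, which is convenient when the front-end features, here TEO-CB-Auto-Env, glottal, and MFCC streams, produce scores with very different dynamic ranges.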


Subject(s)
Depression , Speech , Databases, Factual , Depression/diagnosis , Discriminant Analysis , Emotions